AITopics | world model

Interactive World Simulator for Robot Policy Training and Evaluation

AIHub | Jul-17-2026, 16:48:48 GMT

Imagine you want to teach a robot to push an object on a table. The standard recipe in robot learning is to collect hundreds of expert demonstrations on a real robot, train an imitation learning policy on that data, and then evaluate the policy by running it many times on the same real robot. Both stages (data collection and evaluation) are slow, expensive, and hard to reproduce: hardware breaks, lighting changes, objects drift out of place, and every new task means more hours in the lab. A natural question is whether we can replace some of this real-robot work with a simulator. Classical physics-based simulators are powerful, but building one for a new task means manually modeling geometries, contacts, friction, and deformation, and the resulting simulator often still does not match reality closely enough for policies trained inside it to transfer.

artificial intelligence, simulator, (13 more...)

AIHub

Country: Asia > South Korea (0.15)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

The Download: Claude's inner workings, and the future of world models

MIT Technology Review | Jul-14-2026, 12:10:00 GMT

Plus: New York has become the first state to enact a data center moratorium. When Anthropic announced last week that it had found a new window into its models' "internal thoughts" as they reason through answers, there was one colleague I had to talk to: senior editor Will Douglas Heaven. Aside from having a PhD in computer science, Will has spent a lot of time digging into what we can say about how AI models work. I spoke with him about what we should take from Anthropic's new (and typically quirky) research. Here's what he had to say. How will AI understand the real world?

artificial intelligence, large language model, natural language, (18 more...)

MIT Technology Review

Country: North America > United States > New York (0.26)

Industry: Information Technology > Security & Privacy (0.30)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.42)

The Download: a donor conception cap and world models for AI

MIT Technology Review | Jul-13-2026, 12:10:00 GMT

Plus: Apple has sued OpenAI for allegedly stealing trade secrets. Ties van der Meer doesn't know how many siblings he has. The 47-year-old was conceived at a private fertility clinic using sperm from an anonymous donor. He eventually tracked down one sibling, but he may have others he'll never find. Other donor-conceived people have found they have tens or even hundreds of them. "It does make you feel a bit mass-produced," said one who discovered they had 25 half-siblings.

artificial intelligence, large language model, natural language, (17 more...)

MIT Technology Review

Country: North America > United States (0.48)

Industry: Law > Intellectual Property & Technology Law (0.57)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.42)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.38)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.37)

AI is 'not smart' so what's next in artificial intelligence?

BBC News | Jul-2-2026, 23:02:09 GMT

We don't have robots that are nearly as good at understanding the physical world as a rat, says Yann LeCun, one of the leading figures in the world of artificial intelligence. He worked at Facebook owner Meta for a decade, where he was chief AI scientist, but left in 2025 and founded Advanced Machine Intelligence Labs (AMI Labs). His goal is to move AI beyond current systems like ChatGPT, Claude and Gemini. They have their uses, he says, but will never be able to tackle complicated situations in the real world, like getting a robot to do household chores.

large language model, machine learning, natural language, (14 more...)

BBC News

Country:

North America > United States (0.49)
Europe > United Kingdom > England (0.31)

Industry:

Leisure & Entertainment (0.72)
Government (0.50)
Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (0.91)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

PoE-World: Compositional World Modeling with Products of Experts

Neural Information Processing Systems | Jun-23-2026, 06:08:14 GMT

Learning how the world works is central to building AI agents that can adapt to complex environments. Traditional world models based on deep learning demand vast amounts of training data and do not flexibly update their knowledge from sparse observations. Recent advances in program synthesis using Large Language Models (LLMs) offer an alternative approach that learns world models represented as source code, supporting strong generalization from little data. To date, the application of program-structured world models remains limited to natural language and grid-world domains. We introduce a novel program synthesis method for effectively modeling complex, non-gridworld domains by representing a world model as an exponentially-weighted product of programmatic experts (PoE-World) synthesized by LLMs. We show that this approach can learn complex, stochastic world models from just a few observations. We evaluate the learned world models by embedding them in a model-based planning agent, demonstrating efficient performance and generalization to unseen levels on Atari's Pong and Montezuma's Revenge.
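To make the phrase "exponentially-weighted product of programmatic experts" concrete, here is a minimal sketch in Python. It assumes the standard product-of-experts form, p(o) ∝ ∏ p_i(o)^{w_i}, which the abstract does not spell out. The two expert functions, the state keys `x` and `vx`, and the weights are all hypothetical; in the paper the experts are synthesized by an LLM and the weights are learned.

```python
import math

def product_of_experts(experts, weights, state, outcomes):
    """Combine expert predictions as p(o) proportional to prod_i p_i(o | state) ** w_i."""
    log_scores = []
    for outcome in outcomes:
        score = 0.0
        for expert, weight in zip(experts, weights):
            prob = max(expert(state, outcome), 1e-9)  # floor avoids log(0)
            score += weight * math.log(prob)
        log_scores.append(score)
    # Normalise in log space for numerical stability.
    peak = max(log_scores)
    unnormalised = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalised)
    return {o: u / total for o, u in zip(outcomes, unnormalised)}

# Hypothetical hand-written experts for a ball's next x-velocity in a Pong-like game.
def keeps_velocity(state, outcome):
    # Inertia rule: the velocity usually stays the same.
    return 0.9 if outcome == state["vx"] else 0.05

def bounces_at_wall(state, outcome):
    # Wall rule: at the left wall the velocity usually flips sign.
    if state["x"] <= 0:
        return 0.9 if outcome == -state["vx"] else 0.05
    return 1.0 / 3.0  # uninformative away from the wall

OUTCOMES = [-1, 0, 1]
EXPERTS = [keeps_velocity, bounces_at_wall]
WEIGHTS = [1.0, 3.0]  # assumed values; a larger weight makes an expert more decisive

at_wall = product_of_experts(EXPERTS, WEIGHTS, {"x": 0, "vx": -1}, OUTCOMES)
mid_field = product_of_experts(EXPERTS, WEIGHTS, {"x": 5, "vx": -1}, OUTCOMES)
print(max(at_wall, key=at_wall.get), max(mid_field, key=mid_field.get))
```

Away from the wall the bounce expert is uniform, so the inertia expert decides the outcome; at the wall the more heavily weighted bounce expert overrides it. Each expert only needs to be right about the situations it covers, which is what lets small programs be composed into a larger model.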

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (1.00)
Health & Medicine (0.92)
Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Image as a World: Generating Interactive World from Single Image via Panoramic Video Generation

Neural Information Processing Systems | Jun-23-2026, 04:00:09 GMT

Generating an interactive visual world from a single image is both challenging and practically valuable, as single-view inputs are easy to acquire and align well with prompt-driven applications such as gaming and virtual reality. This paper introduces a novel unified framework, Image as a World (IaaW), which synthesizes high-quality 360-degree videos from a single image that are both controllable and temporally continuable.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Reliable World Simulation for Autonomous Driving

Neural Information Processing Systems | Jun-23-2026, 03:01:08 GMT

How can we reliably simulate future driving scenarios under a wide range of ego driving behaviors? Recent driving world models, developed exclusively on real-world driving data with expert trajectories, struggle to represent hazardous or non-expert behaviors that are rare in the training corpus. This limitation restricts their applicability to tasks such as policy evaluation. In this work, we address this challenge by enriching real-world human demonstrations with diverse non-expert data collected from a driving simulator (e.g., CARLA), and building a controllable world model trained on this heterogeneous corpus. Starting with a video generator featuring a diffusion transformer architecture, we devise several strategies to effectively integrate conditioning signals and improve prediction controllability and fidelity. The resulting model, ReSim, enables Reliable Simulation of diverse open-world driving scenarios under various actions, including hazardous non-expert ones. To close the gap between high-fidelity simulation and applications that require reward signals to judge different actions, we introduce a Video2Reward module that estimates a reward from ReSim's simulated future. Our ReSim paradigm achieves up to 44% higher visual fidelity, improves controllability for both expert and non-expert actions by over 50%, and boosts planning and policy selection performance on NAVSIM by 2% and 25%, respectively.

large language model, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Transportation > Ground > Road (0.66)
Automobiles & Trucks (0.66)
Information Technology > Robotics & Automation (0.43)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Towards foundational LiDAR world models with efficient latent flow matching

Neural Information Processing Systems | Jun-23-2026, 00:16:25 GMT

LiDAR-based world models offer more structured and geometry-aware representations than their image-based counterparts. However, existing LiDAR world models are narrowly trained; each model excels only in the domain for which it was built. This raises a critical question: can we develop LiDAR world models that exhibit strong transferability across multiple domains? To answer this, we conduct the first systematic domain transfer study across three demanding scenarios: (i) outdoor-to-indoor generalization, (ii) sparse-to-dense beam adaptation, and (iii) non-semantic to semantic transfer. Given different amounts of fine-tuning data, our experiments show that a single pretrained model can achieve up to 11% absolute improvement (83% relative) over training from scratch and outperforms training from scratch in 30/36 of our comparisons. This transferability significantly reduces the reliance on manually annotated data for semantic occupancy forecasting: our method exceeds previous baselines with only 5% of the labeled training data of prior work. We also observed inefficiencies of current generative-model-based LiDAR world models, mainly through their under-compression of LiDAR data and inefficient training objectives. To address these issues, we propose a latent conditional flow matching (CFM)-based framework that achieves state-of-the-art reconstruction accuracy using only half the training data and a compression ratio 6 times higher than that of prior methods. Our model also achieves SOTA performance on semantic occupancy forecasting while being 1.98x-23x more computationally efficient (a 1.1x-3.9x
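For readers unfamiliar with conditional flow matching, the following Python sketch shows one common form of the training objective. It assumes a straight-line probability path between a Gaussian noise sample and a data latent, which the abstract does not confirm as the paper's choice. The function names, the toy latent `[2.0, 4.0]`, and the `condition` argument are hypothetical stand-ins for a learned network and encoded LiDAR history.

```python
import random

def cfm_training_pair(x0, x1, t):
    """Point on the straight path from noise x0 to data latent x1, plus its velocity."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target_velocity = [b - a for a, b in zip(x0, x1)]  # constant along a straight path
    return x_t, target_velocity

def cfm_loss(velocity_model, x1, condition, rng):
    """One-sample estimate of the flow matching loss for a data latent x1."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # noise sample
    t = rng.random()                         # time in [0, 1)
    x_t, target = cfm_training_pair(x0, x1, t)
    predicted = velocity_model(x_t, t, condition)
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(x1)

# Hypothetical oracle "model": the condition carries the true latent, so the
# exact velocity (x1 - x_t) / (1 - t) can be computed in closed form.
def oracle_velocity(x_t, t, condition):
    return [(c - x) / (1.0 - t) for c, x in zip(condition, x_t)]

# Hypothetical untrained "model" that always predicts zero velocity.
def zero_velocity(x_t, t, condition):
    return [0.0 for _ in x_t]

latent = [2.0, 4.0]  # toy stand-in for a compressed LiDAR latent
oracle_loss = cfm_loss(oracle_velocity, latent, latent, random.Random(0))
zero_loss = cfm_loss(zero_velocity, latent, latent, random.Random(0))
print(oracle_loss < zero_loss)
```

The regression target is a simple velocity vector, so training needs no simulation of the generative process. Sampling then integrates the learned velocity field from noise to a latent.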

artificial intelligence, forecasting, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

SPARTAN: A Sparse Transformer World Model Attending to What Matters

Neural Information Processing Systems | Jun-22-2026, 23:51:53 GMT

Capturing the interactions between entities in a structured way plays a central role in world models that flexibly adapt to changes in the environment. Recent works motivate the benefits of models that explicitly represent the structure of interactions and formulate the problem as discovering local causal structures. In this work, we demonstrate that reliably capturing these relationships in complex settings remains challenging. To remedy this shortcoming, we postulate that sparsity is a critical ingredient for the discovery of such local structures. To this end, we present the SPARse TrANsformer World model (SPARTAN), a Transformer-based world model that learns context-dependent interaction structures between entities in a scene. By applying sparsity regularisation on the attention patterns between object-factored tokens, SPARTAN learns sparse, context-dependent interaction graphs that accurately predict future object states. We further extend our model to adapt to sparse interventions with unknown targets in the dynamics of the environment. This results in a highly interpretable world model that can efficiently adapt to changes. Empirically, we evaluate SPARTAN against the current state-of-the-art in object-centric world models in observation-based environments and demonstrate that our model can learn local causal graphs that accurately reflect the underlying interactions between objects, achieving significantly improved few-shot adaptation to dynamics changes, as well as robustness against distractors.
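The idea of regularising attention towards sparsity can be illustrated with a small Python sketch. The abstract does not state which regulariser SPARTAN uses, so this example substitutes an entropy penalty on each attention row. That is an assumed stand-in, chosen because an L1 penalty is constant for softmax rows that sum to one. The function names and the `sparsity_weight` value are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_entropy(attn_rows):
    """Mean Shannon entropy of the attention rows; lower means sparser attention."""
    total = 0.0
    for row in attn_rows:
        total += -sum(p * math.log(p) for p in row if p > 0.0)
    return total / len(attn_rows)

def regularised_loss(prediction_loss, attn_rows, sparsity_weight=0.1):
    """Prediction loss plus an (assumed) entropy penalty that favours sparse attention."""
    return prediction_loss + sparsity_weight * attention_entropy(attn_rows)

# Three object tokens: one pattern attends everywhere, the other mostly to one token.
dense_attention = [softmax([0.0, 0.0, 0.0]) for _ in range(3)]
sparse_attention = [softmax([10.0, 0.0, 0.0]) for _ in range(3)]

print(regularised_loss(1.0, dense_attention) > regularised_loss(1.0, sparse_attention))
```

With equal prediction loss, the penalty prefers the pattern in which each object attends to few others. Thresholding such rows gives an interaction graph of the kind the abstract describes.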

artificial intelligence, causal graph, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Curious Causality-Seeking Agents in Open-ended Worlds

Neural Information Processing Systems | Jun-22-2026, 23:48:28 GMT

When building a world model, a common assumption is that the environment has a single, unchanging underlying causal rule, like applying Newton's laws to every situation. However, in truly open-ended environments, the apparent causal mechanism may drift over time because the agent continually encounters novel contexts and operates within a limited observational window. The problem is that even subtle shifts in policy or environment state can alter the causal mechanisms the agent observes. In this work, we introduce the Meta-Causal Graph as a world model for open-ended environments, a minimal unified representation that efficiently encodes the transformation rules governing how causal structures shift across different latent world states. A single Meta-Causal Graph is composed of multiple causal subgraphs, each triggered by a meta state in the latent state space. Building on this representation, we introduce a Causality-Seeking Agent whose objectives are to (1) identify the meta states that trigger each subgraph, (2) discover the corresponding causal relationships through a curiosity-driven intervention policy, and (3) iteratively refine the Meta-Causal Graph through ongoing curiosity-driven exploration and agent experiences. Experiments on both synthetic tasks and a challenging robot arm manipulation task demonstrate that our method robustly captures shifts in causal dynamics and generalizes effectively to previously unseen contexts.
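The structure described above, several causal subgraphs each selected by a meta state, can be sketched as a small Python data structure. This is an illustrative assumption about the representation, not the paper's implementation. The class, its methods, the meta state labels, and the variable names are all hypothetical, and the meta states are discrete labels here, whereas the abstract places them in a latent state space.

```python
from dataclasses import dataclass, field

@dataclass
class MetaCausalGraph:
    """Maps each meta state to the causal subgraph that is active in it."""
    subgraphs: dict = field(default_factory=dict)  # meta state -> set of (cause, effect)

    def add_edge(self, meta_state, cause, effect):
        """Record that `cause` influences `effect` while `meta_state` holds."""
        self.subgraphs.setdefault(meta_state, set()).add((cause, effect))

    def parents(self, meta_state, variable):
        """Causes of `variable` under the given meta state, in sorted order."""
        edges = self.subgraphs.get(meta_state, set())
        return sorted(cause for cause, effect in edges if effect == variable)

# Hypothetical robot arm example: the arm moves the object only while grasping it.
graph = MetaCausalGraph()
graph.add_edge("grasping", "arm_position", "object_position")
graph.add_edge("grasping", "gripper_force", "object_position")
graph.add_edge("free", "gravity", "object_position")

print(graph.parents("grasping", "object_position"))
print(graph.parents("free", "object_position"))
```

The same variable has different causes depending on the meta state. An agent of the kind described would have to discover both which meta state it is in and which edges belong to that state.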
